ABSTRACT

The first generation of redshifted 21 cm detection experiments, carried out with arrays like
LOFAR, MWA and GMRT, will have a very low signal-to-noise ratio per resolution element
($\lesssim 0.2$). In addition, whereas the variance of the cosmological signal decreases on scales
larger than the typical size of ionization bubbles, the variance of the formidable galactic
foregrounds increases, making it hard to disentangle the two on such large scales. The poor
sensitivity on small scales on the one hand, and the foregrounds effect on large scales on the other
hand, make direct imaging of the Epoch of Reionization of the Universe very difficult, and
detection of the signal therefore is expected to be statistical. Despite these hurdles, in this paper
we argue that for many reionization scenarios low resolution images could be obtained from
the expected data. This is because at the later stages of the process one still finds very large
pockets of neutral regions in the IGM, reflecting the clustering of the large-scale structure,
which stays strong up to scales of $\approx 120\,h^{-1}$ comoving Mpc ($\approx 1^\circ$). The coherence of the
emission on those scales allows us to reach sufficient S/N ($\gtrsim 3$) so as to obtain reionization
21 cm images. Such images will be extremely valuable for answering many cosmological
questions but above all they will be a very powerful tool to test our control of the systematics
in the data. The existence of this typical scale ($\approx 120\,h^{-1}$ comoving Mpc) also argues for
designing future EoR experiments, e.g., with SKA, with a field of view of at least $4^\circ$.

1 INTRODUCTION

During the Epoch of Reionization (EoR), gas in the Universe was
reionized after having been neutral for about 500 Myr during the
so-called Dark Ages. The EoR is thought to be caused by the first
radiating sources, and its study is crucial to our understanding of 
the physics of these sources and how they influenced the formation
of later generations of astrophysical objects. Current observational
constraints indicate that the EoR occurred at $6.5 \lesssim z \lesssim 12$,
as inferred from SDSS high redshift quasar spectra (Fan et al.
2003, 2006), WMAP (Page et al. 2007), SPT (Zahn et al. 2011)
and IGM temperature measurements (Theuns et al. 2002; Bolton 
et al. 2010). In addition, recent HST observations of the Hubble 
Ultra Deep Field taken with the new Wide Field Camera 3 (WFC3) 
found a large sample of Lyman break galaxies at $7 \lesssim z \lesssim 10$ (see
e.g., Oesch et al. 2010; Bouwens et al. 2010; Bunker et al. 2010). 
These authors found that these galaxies do not produce enough ion- 
izing photons to account for the Universe's full reionization by red- 
shift 6 and concluded that an additional source of ionizing photons 
is required (Bouwens et al. 2011). Furthermore, measurements of
the number of ionizing photons per baryon from Lyman $\alpha$ forest
spectra at $z \approx 6$ yield a low number of such photons ($\sim 3$), that
is, the reionization process is photon starved (Bolton & Haehnelt
2007; Calverley et al. 2011). When combined, these measurements
lend themselves to the notion that reionization is a slow and drawn 
out process. 

Our current best hope to study this epoch in detail lies in obser- 
vations of the redshifted HI 21cm emission line (see e.g., Madau 
et al. 1997; Shaver et al. 1999; Furlanetto et al. 2006; Pritchard 
& Loeb 2011). To date, a number of experiments are planning to
measure the EoR with the redshifted 21 cm line (e.g. LOFAR$^1$,
GMRT$^2$, MWA$^3$, 21CMA$^4$, PAPER$^5$). These experiments seek
statistical detection of the cosmological 21 cm signal, with the most
widely studied such statistics being the rms and the power spectrum 
of the brightness temperature and their evolution with time (e.g. 
Morales & Hewitt (2004); Barkana & Loeb (2005); McQuinn et al. 
(2006); Bowman et al. (2006); Pritchard & Furlanetto (2007); Jelic 
et al. (2008); Harker et al. (2009a, 2010); Pritchard & Loeb (2008)). 
More recently, however, Datta et al. (2012) have shown that one can
image large bubbles of ionization around very high redshift powerful
quasars with LOFAR. In particular, Jelic et al. (2008), Harker et al.
(2010) and more recently Chapman et al. (2012) showed that de- 
spite the low signal-to-noise ratio, prominent foregrounds and in- 
strumental response, the 21 cm rms and power spectrum can be 
extracted from the data collected with the Low Frequency Array 
(LOFAR). Similar studies have been carried out for the MWA case 
(Geil et al. 2008, 2011).

The current generation of telescopes is designed to detect
the EoR statistically, rather than image it, for a number of reasons. 
On small scales the noise level per resolution element is relatively 
high. For example, at 150 MHz LOFAR will have a 56 mK system
noise per resolution element ($\approx 3$ arcmin) after 600 hours of
observations with a 1 MHz bandwidth, corresponding to a
signal-to-noise ratio of $\sim 0.2$ at these scales. On large scales there are two
issues. The first one has to do with the typical sizes of ionized and 
neutral regions at each redshift, which limits the maximum scale at 
which smoothing of the data remains useful. It can also be shown 
that during the dark ages (no ionization bubbles) smoothing on very 
large scales does not help so much due to the low level of contrast 
and to the fact that beyond 1 degree the power due to cosmological 
fluctuations drops very quickly (see e.g., Santos et al. 2005; Jelic 
et al. 2008). The other, and potentially more severe, issue is that of 
the foregrounds which dominate the measurement on all scales and 
become even more prominent on large scales with power increasing
as $\theta^{2.5-3}$ (Tegmark et al. 2000; Giardino et al. 2002; Santos
et al. 2005; Jelic et al. 2008, 2010; Bernardi et al. 2009, 2010). 
This means that on large scales the influence of these foregrounds 
will be harder to filter out. The common wisdom is that scales be- 
yond half a degree will remain inaccessible even after a number of 
years of observations. 

1 Low Frequency Array, http://www.lofar.org/

2 Giant Metrewave Radio Telescope, http://www.gmrt.ncra.tifr.res.in/

3 Murchison Widefield Array, http://www.haystack.mit.edu/ast/arrays/mwa/

4 21 Centimeter Array, http://web.phys.cmu.edu/~past/

5 Precision Array to Probe the EoR, http://astro.berkeley.edu/~dbacker/eor/

In this paper we argue that, despite the hurdles we have listed,
imaging of the EoR on very large scales is possible with the
current generation of telescopes. In presenting our case we rely
on two arguments. Firstly, inspection of large scale EoR
simulations ($\gtrsim 200\,h^{-1}$ comoving Mpc) shows that towards the end of
the reionization process one can still find sufficiently large
neutral patches so as to allow imaging of this process after
smoothing on sufficiently large scales. For example, a neutral patch
towards the end of reionization with a scale of roughly
$120\,h^{-1}$ comoving Mpc ($1^\circ$) would have a signal-to-noise ratio
per smoothing cell, after 600 hours of observations, of about 4,
since it would contain about $20 \times 20$ independent resolution elements
($1^\circ/3' = 20$). Notice that this argument would not work early in
the reionization process since the variation in the 21 cm intensity
in a given field, measured by radio interferometers, will be driven
by the cosmological density fluctuation field, $\delta$, which is
relatively small, unless the spin temperature itself exhibits fluctuations
above and below the CMB temperature (Pritchard & Furlanetto
2007; Pritchard & Loeb 2008, 2010; Baek et al. 2010; Thomas &
Zaroubi 2011). Towards the later stages of the reionization process,
however, the variations in the intensity are driven by the difference
between neutral and ionized regions. The $120\,h^{-1}$ comoving Mpc
scale is driven mainly by the clustering scale of the Universe's
large-scale structure, which the ionization sources, independent of
their nature, tend to follow.
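
The signal-to-noise estimate quoted above can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: it assumes that the noise in neighbouring resolution elements is uncorrelated and that the signal is fully coherent across the smoothing cell, and the function name `smoothed_snr` is introduced here and does not come from any analysis pipeline.

```python
import math


def smoothed_snr(snr_per_element, cell_arcmin, resolution_arcmin):
    """S/N of a coherent signal averaged over a square smoothing cell.

    Assumes uncorrelated noise between resolution elements, so the noise
    averages down as the square root of the number of elements.
    """
    n_side = cell_arcmin / resolution_arcmin   # elements along one side
    n_elements = n_side ** 2                   # elements in the cell
    return snr_per_element * math.sqrt(n_elements)


# A 1 degree (60 arcmin) cell at 3 arcmin resolution, S/N of 0.2 per element.
snr_one_degree = smoothed_snr(0.2, 60.0, 3.0)
print(f"S/N per 1 degree cell: {snr_one_degree:.1f}")
```

With the numbers used in the text (0.2 per element, 20 x 20 elements) the gain is a factor of 20, which is where the value of about 4 comes from.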

Secondly, the current state-of-the-art foreground fitting meth- 
ods, such as Wp smoothing (Harker et al. 2010) or Independent 
Component Analysis (Chapman et al. 2012), do a very good job 
even on large scales, rendering them accessible for EoR analysis.
The availability of such techniques, together with the existence
of the large-scale neutral patches towards the end of reionization,
makes it possible to image the EoR from 21 cm data on large scales.

This paper is organized as follows. In section 2 we describe 
the cosmological signal and its basic equation. In section 3 we in- 
troduce the very large-scale simulations needed to demonstrate our 
argument. In these simulations we also include the influence of the 
telescope response, noise and foregrounds. In section 4 it is shown 
that, based on these large-scale simulations, imaging of the EoR on 
large scales is indeed possible with current instruments. We also ar- 
gue that the key to successful imaging on large scales is the ability 
to remove the foreground signal with sufficient accuracy on scales 
larger than 1 degree. The paper concludes with a summary and out- 
look (§ 6). 



2 COSMOLOGICAL 21 CM SIGNAL 

In radio astronomy, where the Rayleigh-Jeans law is usually
applicable, the radiation intensity, $I(\nu)$, is expressed in terms of the
brightness temperature, $T_b$, so that

$$I(\nu) = \frac{2\nu^2}{c^2}\, k_B T_b, \qquad (1)$$

where $\nu$ is the radiation frequency, $c$ is the speed of light and $k_B$ is
Boltzmann's constant (Rybicki & Lightman 1986). This in turn can
only be detected differentially as a deviation from the Cosmic
Microwave Background (CMB) temperature, $T_{\rm CMB}$. The relevant
quantity is therefore the predicted differential brightness temperature,
$\delta T_b \equiv T_b - T_{\rm CMB}$, which reflects the fact that the only
meaningful brightness temperature measurement, insofar as the
intergalactic medium (IGM) is concerned, is when it deviates from
$T_{\rm CMB}$. Derivation of $\delta T_b$ yields (Field
1958, 1959; Madau et al. 1997; Ciardi & Madau 2003),

$$\delta T_b \approx 28\,{\rm mK}\,(1+\delta)\,x_{\rm HI}\left(1 - \frac{T_{\rm CMB}}{T_{\rm spin}}\right)\left(\frac{\Omega_b h^2}{0.0223}\right)\sqrt{\left(\frac{1+z}{10}\right)\left(\frac{0.24}{\Omega_m}\right)}\left[\frac{H(z)/(1+z)}{dv_r/dr}\right], \qquad (2)$$

where $h$ is the Hubble constant in units of 100 km s$^{-1}$ Mpc$^{-1}$, $\delta$
is the mass density contrast, $v_r$ is the line-of-sight velocity
component, $x_{\rm HI}$ is the neutral fraction, $\Omega_m$ and $\Omega_b$ are the mass and
baryon densities in units of the critical density and $H(z)$ is the
Hubble parameter. Note that the four quantities, $\delta$, $v_r$, $x_{\rm HI}$ and $T_{\rm spin}$, are
all functions of 3D position.

Equation 2 shows that the differential brightness temperature 
is composed of a mixture of cosmology dependent and astrophysics 
dependent terms. The equation is clearly complex yet at the same 
time information rich. This is simply because at different stages in
the evolution of reionization $\delta T_b$ is dominated by different
contributions. For example, at certain redshift ranges no significant
ionization has taken place, i.e. $x_{\rm HI} \approx 1$ everywhere, yet there
is enough heating to render $T_{\rm spin} \gg T_{\rm CMB}$. In such a case, the
brightness temperature is proportional to the density fluctuations,
making its measurement an excellent probe of cosmology.
However, at low redshifts ($z \lesssim 9$) a significant fraction of the Universe
is expected to be ionized and the measurement is dominated by
the contrast between the neutral and ionized regions, hence
probing the astrophysical source of ionization (see e.g., Iliev et al.
2008; Thomas et al. 2009; Thomas & Zaroubi 2011).
During the stage at which reionization occurs it is safe to assume that
$T_{\rm spin} \gg T_{\rm CMB}$ (Pritchard & Loeb 2008, 2010).
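
The scalings discussed above can be made concrete with a small numerical sketch. It assumes the commonly quoted form of the differential brightness temperature, $\delta T_b \approx 28\,{\rm mK}\,(1+\delta)\,x_{\rm HI}\,(1 - T_{\rm CMB}/T_{\rm spin})\,(\Omega_b h^2/0.0223)\,\sqrt{[(1+z)/10]\,[0.24/\Omega_m]}$ times a peculiar velocity term; the function name, the default parameter values and the present-day CMB temperature of 2.725 K are choices made for this illustration only.

```python
import math


def delta_tb_mk(z, delta=0.0, x_hi=1.0, t_spin=math.inf, t_cmb0=2.725,
                omega_b_h2=0.0223, omega_m=0.24, velocity_term=1.0):
    """Differential brightness temperature in mK for one cell.

    velocity_term stands for [H(z)/(1+z)] / (dv_r/dr); it equals 1 when
    peculiar velocities are ignored (an assumption of this sketch).
    """
    t_cmb = t_cmb0 * (1.0 + z)                  # CMB temperature at redshift z
    spin_factor = 1.0 - t_cmb / t_spin          # tends to 1 for T_spin >> T_CMB
    cosmology = (omega_b_h2 / 0.0223) * math.sqrt(
        (1.0 + z) / 10.0 * 0.24 / omega_m)
    return 28.0 * (1.0 + delta) * x_hi * spin_factor * cosmology * velocity_term


# Neutral gas at mean density versus a fully ionized cell at z = 9.
neutral = delta_tb_mk(9.0)
ionized = delta_tb_mk(9.0, x_hi=0.0)
overdense = delta_tb_mk(9.0, delta=0.1)
print(neutral, ionized, overdense)
```

In this toy setting the contrast between a neutral and an ionized cell is the full mean signal, whereas a 10 per cent overdensity changes the signal by only 10 per cent, which illustrates why the late stages of reionization offer more contrast than the density field alone.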

Since interferometers do not measure the mean $\delta T_b$ but rather
are sensitive to its fluctuations, it is easy to see that early in the
reionization process, when the ionized regions are quite negligible,
the rms of the measurement will be driven by $\delta$, the cosmological
density contrast fluctuations (assuming $T_{\rm spin} \gg T_{\rm CMB}$). Later
in the reionization process, however, when the typical size of the
ionized regions becomes larger than the interferometer's resolution
scale, the measured signal is driven by $x_{\rm HI}$, which increases the
contrast and as a result the ability to observe the signal.

In this paper we ignore the issue of calibration errors. This 
issue might become important for the real data (Datta et al. 2010), 
but so far the indication is that the calibration will be achieved with 
very high accuracy (Yatawatta et al. 2009; Kazemi et al. 2011) as 
shown recently by observation of two LOFAR fields (Yatawatta and 
Labropoulos, private communication). 



3 DATA SIMULATIONS 

Since here we argue that instruments such as LOFAR will enable 
imaging the EoR on large scales, it is important to test the influ- 
ence of the noise, instrument response, foreground extraction, and 
the distribution of the 21 cm signal power on various scales as a 
function of redshift. Hence one needs to explore scales well in
excess of $\approx 120\,h^{-1}$ comoving Mpc, namely, about 1 degree on
the sky. This is the natural scale of the large-scale structure and is
marked in the cosmological power spectrum by the turn over from
$P(k) \propto k^{n}$ to $P(k) \propto k^{n-4}$, where the primordial power law
index $n \approx 1$. This scale is equal to the comoving horizon size at the
era of equality between matter and radiation.
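
The correspondence between $120\,h^{-1}$ comoving Mpc and roughly one degree can be checked with a short calculation. The sketch below assumes a flat $\Lambda$CDM cosmology with $\Omega_m = 0.24$ (the fiducial value appearing in Eq. 2) and neglects radiation; the function names are hypothetical and the integration is a plain trapezoidal rule.

```python
import numpy as np

C_OVER_H0 = 2997.92458  # Hubble distance c/H0 in h^-1 Mpc (H0 = 100 h km/s/Mpc)


def comoving_distance(z, omega_m=0.24, n_steps=20000):
    """Line-of-sight comoving distance in h^-1 Mpc for flat LCDM."""
    zs = np.linspace(0.0, z, n_steps + 1)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zs) ** 3 + (1.0 - omega_m))
    dz = zs[1] - zs[0]
    # Trapezoidal rule written out explicitly.
    integral = dz * (inv_e.sum() - 0.5 * (inv_e[0] + inv_e[-1]))
    return C_OVER_H0 * integral


def angular_size_deg(comoving_size, z, omega_m=0.24):
    """Angle subtended by a transverse comoving length (small-angle limit)."""
    return np.degrees(comoving_size / comoving_distance(z, omega_m))


angle = angular_size_deg(120.0, 9.0)
print(f"120 h^-1 Mpc at z = 9 subtends about {angle:.2f} degrees")
```

Under these assumptions the angle comes out close to one degree at the redshifts of interest, and it changes only slowly across the EoR redshift range because the comoving distance grows slowly at high redshift.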

Therefore, as a first step, we have to create very large-scale 
cosmological reionization simulations, from which a $\delta T_b$ signal
cube is produced (signal as a function of frequency). We then add 
to this cube a realistic model of foregrounds, expected instrumental 
response and (system) noise. 



3.1 Large-scale 21 cm simulations 

Full radiative transfer simulations on very large scales - in excess of 
$200\,h^{-1}$ comoving Mpc - are not available. Hence, recourse must
be had to semi-analytical methods. Here we make use of two sets 



of simulations. The first simulations are produced using the pub- 
licly available package 21cmFAST (Mesinger et al. 2010; Mesinger 
& Furlanetto 2007; Zahn et al. 2007) which uses a semi-analytical 
approach to produce very large-scale simulations of the reionization
process. See also Santos et al. (2010), who use an alternative
method to create fast large-scale reionization and redshifted 21 cm
simulations, called simFast21.

21cmFAST uses perturbation theory, the excursion set formal- 
ism, and analytic prescriptions to generate evolved 3D realizations 
of the density, ionization, peculiar velocity and spin temperature 
fields, which it then combines to compute the 21-cm brightness 
temperature. The method has been thoroughly tested against more 
accurate reionization codes (Mesinger et al. 2010). 

We produce three-dimensional $400\,h^{-1}$ comoving Mpc
simulation boxes of the brightness temperature from redshift 12
down to redshift 6. The resolution of the simulation box is
$1\,h^{-1}$ comoving Mpc. The simulations assume the standard
WMAP cosmological parameters (Spergel et al. 2007). The
simulation box is binned with a $400^3$ grid. The output of the code
includes the spin temperature, the ionization fraction, the kinetic
temperature and peculiar velocity field. These outputs are used by
21cmFAST to create a brightness temperature box. For more
details please see Mesinger et al. (2010) and Mesinger & Furlanetto
(2007). We then use the method developed by Thomas et al. (2009)
to create an observational box of the 21 cm signal spanning the
frequency range of 115-200 MHz. A slice through the signal cube
along the frequency direction is shown in Fig. 1. In this simulation
the ionization process reaches its mid point at $z \sim 10$. We will
present a number of plots in the remainder of the paper at $z = 9$,
where the neutral fraction is 0.2.
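
The mapping between observing frequency and redshift that underlies such a frequency cube is simply $\nu_{\rm obs} = \nu_{21}/(1+z)$. A minimal sketch follows; the rest-frame frequency of 1420.40575 MHz is the standard value for the HI hyperfine line, and the function names are introduced here for illustration only.

```python
NU_21CM_MHZ = 1420.40575  # rest-frame frequency of the HI 21 cm line


def redshift_of(freq_mhz):
    """Redshift at which the 21 cm line is observed at freq_mhz."""
    return NU_21CM_MHZ / freq_mhz - 1.0


def frequency_of(z):
    """Observed frequency in MHz of 21 cm emission from redshift z."""
    return NU_21CM_MHZ / (1.0 + z)


for freq in (115.0, 150.0, 200.0):
    print(f"{freq:6.1f} MHz  ->  z = {redshift_of(freq):.2f}")
print(f"z = 9  ->  {frequency_of(9.0):.1f} MHz")
```

The 115-200 MHz band therefore covers redshifts from roughly 11.4 down to 6.1, and the $z = 9$ slices discussed in the text sit near 142 MHz.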

The second set of simulations we use to test the imaging
possibility is based on the EoR simulations program called BEARS
(Thomas & Zaroubi 2008; Thomas et al. 2009). This scheme
includes the physics of reionization in more detail relative to
21cmFAST but needs cosmological simulations in order to create
the reionization history, which makes it more time consuming than
21cmFAST. BEARS assumes a spherically symmetric ionization
region around each ionizing source but can easily allow for a wide
range of sources with very different spectral energy distributions
(SEDs). Here we use a dark matter simulation of $512^3$ particles
in a cube with comoving side length of $200\,h^{-1}$ comoving Mpc
with WMAP standard parameters. The sides thus have twice the
length of the simulations shown in Thomas et al. (2009) and used in
our previous work on LOFAR EoR signal extraction (Harker et al.
2009a,b). This leads to a minimum resolved halo mass of around
$3 \times 10^{10}\,h^{-1}\,M_\odot$. Dark matter haloes are populated with sources
whose properties depend on some assumed model. For this paper
we explore both the 'quasar-type' and the stellar source models of
Thomas & Zaroubi (2008), Thomas et al. (2009) and Zaroubi et al.
(2007). The topology and morphology of reionization is different
in the two source models. We might expect quasar reionization to
allow an easier detection than stellar reionization, since the regions
where the sources are found are larger and more highly clustered,
producing larger fluctuations in the signal. The BEARS set is used here as a
check on whether a completely different and more detailed
simulation scheme yields similar conclusions to the ones based on
21cmFAST simulations.

Unfortunately, a full and detailed radiative transfer simulation
on such a scale is not publicly available at this stage and we expect
our conclusions to vary somewhat with higher resolution
simulations. Still, the main results are expected to remain valid, as is
indeed seen in such a simulation that is currently being analyzed by
Iliev & Mellema (private communication).

Figure 1. A slice of $\delta T_b$ along the frequency direction for the $400\,h^{-1}$ comoving Mpc simulation obtained with 21cmFAST. The redshift at which the box
is half ionized is 10. We will show results from redshift 9 in this paper where the neutral fraction is 0.2. The color scale is shown to the right of the figure.

3.2 Instrumental response 

In radio interferometry the measured spatial correlation of the
electric field between two interferometric elements (stations) $i$ and $k$ at
a given time, $t$, is called the visibility and is given by (Taylor et al.
1999; Thompson et al. 2001):

$$V_\nu^{ik}(u,v;t) = \int I_\nu^{ik}(l,m;t)\, e^{2\pi j(ul+vm)}\, dl\, dm, \qquad (3)$$

where $I_\nu^{ik}$ is the intensity at frequency $\nu$ observed by
correlating stations $i$ and $k$, and $j = \sqrt{-1}$. The coordinates $l$ and
$m$ are the projections (direction cosines) of the source in terms of
the baseline$^1$. The size of the station gives the resolution, i.e.,
minimum uv cell size, at which the uv-plane is covered. From this
equation it is evident that the observed visibility is basically the Fourier
transform of the intensity measured at the coordinates $u$ and $v$
(uv-plane). Following Harker et al. (2010), we define a sampling
function,

$$S_\nu(u,v) = \int_{\forall k \in {\rm pixel}(u,v)} \delta_D(u' - u_k, v' - v_k; t, \nu)\, dt\, du'\, dv', \qquad (4)$$

which describes how a distribution of interferometer baselines samples
Fourier space during the time of observation. Here $\delta_D$ is the Dirac
delta function and $k$ labels the uv-track of a baseline.

Obviously, Eq. 4 indicates that the sampling function depends
on frequency as well as on the number of baselines and their
distribution, i.e., how the visibilities are distributed in the uv-plane.
Here we assume that the uv coverage is the same at all
frequencies. In practice, a uniform uv coverage at all frequencies could be
achieved by ignoring all the uv points that are only partially
covered within the frequency range of interest. This would require
discarding some of the data; in the case of LOFAR one loses
approximately 20 per cent of the data. This has two effects: an
increase in the level of noise and a reduction in the resolution at high
frequencies. However, the assumption here is that the uv coverage
of the telescope is quite dense and complete. In practice, the data
from all baselines, including the long ones, will be used to model
and remove compact sources.

1 We ignore the effect of the Earth's curvature, the so-called w-projection.

In the case of LOFAR, to simulate our data in the uv plane we 
perform a two-dimensional Fourier transform on the image of the 
foregrounds and signal at each frequency, and multiply by a mask 
(the uv coverage) which is unity at grid points in Fourier space (uv 
cells) where S(u, v) > 0, and is zero elsewhere. 
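
The masking step described above can be sketched in a few lines. This is an illustration under simplifying assumptions and not the LOFAR simulation pipeline: the uv coverage is replaced by a toy annular mask (the hypothetical helper `annular_mask`), the image is a periodic square grid, and the same mask is applied at every frequency.

```python
import numpy as np


def annular_mask(n, k_min, k_max):
    """Toy uv coverage: unity where k_min <= |k| <= k_max, zero elsewhere.

    |k| is measured in cycles per image; a real mask would be built from
    the sampling function S(u, v) > 0 of the array.
    """
    k = np.fft.fftfreq(n) * n
    k_radius = np.hypot(k[:, None], k[None, :])
    return ((k_radius >= k_min) & (k_radius <= k_max)).astype(float)


def observe_through_mask(image, uv_mask):
    """Fourier transform one frequency slice, apply the mask, transform back."""
    visibilities = np.fft.fft2(image)
    # The toy mask is symmetric under (u, v) -> (-u, -v), so the result is
    # real up to rounding errors and the imaginary part can be dropped.
    return np.fft.ifft2(visibilities * uv_mask).real


rng = np.random.default_rng(0)
sky = rng.normal(size=(64, 64))          # stand-in for signal + foregrounds
mask = annular_mask(64, 1.0, 20.0)       # no zero spacing, limited resolution
dirty = observe_through_mask(sky, mask)
print(dirty.mean(), dirty.std(), sky.std())
```

Because the zero-spacing cell is excluded from the toy mask, the mean of the output vanishes, which mirrors the fact that an interferometer is insensitive to the mean brightness temperature.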

3.3 Noise 

Assuming the noise in the measurement follows a Gaussian
distribution for each component, the uncertainty in the measurement
of the visibility at a given uv-plane pixel is inversely proportional
to $\sqrt{S}$. The noise realization in the uv-plane is created at each uv
point by drawing from a complex Gaussian field with an rms
proportional to $1/\sqrt{S}$ for all the cells within the mask down to a uv
distance that is equivalent to 4 arcmin resolution. At scales
smaller than this we truncate the sampling. The truncation is
performed in order to avoid the noise normalization being controlled
by the very low number of samplings at the edge of our uv
sampling, namely, at the limit in which the Poisson noise cannot be
approximated by white Gaussian noise. The realization in
the uv-plane is done so as to fulfill the reality condition. The noise
realization in the image plane can then be obtained by inverse
Fourier transforming the uv-plane noise. The overall normalization of the level of noise is
chosen so that the noise images have an rms in the image-plane of
56 mK on an image using 1 MHz bandwidth at 150 MHz at the
resolution limit. This is the rms expected from LOFAR after 600 hours
of observation of one EoR window with one synthesized beam. The
noise level depends on the system temperature, which is assumed to
be $T_{\rm sys} = 140 + 60 \times (\nu/300\,{\rm MHz})^{-2.55}$ K (Jelic et al. 2008). A
much more detailed account of the calculation of noise levels and
the effects of instrumental corruption for the LOFAR EoR project
may be found in Labropoulos et al. (2009).

Figure 2. A noise realization in the uv-plane for 600 hours of observation.
The absolute value of the noise is drawn here and shown in K (and not mK).
The rms of the noise in the image plane is about 56 mK. In order to convert
from Jy to mK we assume that $k = 2\pi/\theta$. The truncation in the map is
made at a radius corresponding to 4 arcmin.

Fig. 2 shows a noise realization used in the simulations. The 
circle beyond which there is a cut in the uv data corresponds to 
4 arcmin resolution at 150 MHz. The distribution of the LOFAR 
stations was chosen to ensure a constant noise at each scale for a 
measurement of the power spectrum. This is why the noise level 
per pixel increases as a function of uv radius. Remember that the 
experiment was optimized for a power spectrum measurement and 
not for imaging. 
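
The noise recipe of this section can be sketched as follows. The code is an illustration under stated assumptions rather than the procedure actually used for the LOFAR simulations: the sampling function is a made-up radial profile that is symmetric under $(u,v) \to (-u,-v)$, the reality condition is satisfied by starting from the Fourier transform of a real white-noise image, and the names `t_sys_kelvin` and `noise_image` are hypothetical.

```python
import numpy as np


def t_sys_kelvin(freq_mhz):
    """System temperature model quoted in the text."""
    return 140.0 + 60.0 * (freq_mhz / 300.0) ** -2.55


def noise_image(sampling, target_rms_mk, rng):
    """Image-plane noise whose uv-plane rms scales as 1/sqrt(S).

    Cells with S = 0 (outside the mask or beyond the truncation radius)
    receive no noise power. Assumes S(u, v) = S(-u, -v).
    """
    # The FFT of a real white-noise image is a complex Gaussian field that
    # already obeys the reality (Hermitian symmetry) condition.
    white_uv = np.fft.fft2(rng.normal(size=sampling.shape))
    weight = np.zeros(sampling.shape)
    covered = sampling > 0
    weight[covered] = 1.0 / np.sqrt(sampling[covered])
    noise = np.fft.ifft2(white_uv * weight).real
    # Overall normalization is fixed in the image plane.
    return noise * (target_rms_mk / noise.std())


n = 128
k = np.fft.fftfreq(n) * n
k_radius = np.hypot(k[:, None], k[None, :])
# Hypothetical coverage: denser at short baselines, truncated at |k| = 40.
sampling = np.where(k_radius <= 40.0, 1.0 / (1.0 + k_radius), 0.0)

noise_600h = noise_image(sampling, 56.0, np.random.default_rng(1))
noise_2400h = noise_600h * np.sqrt(600.0 / 2400.0)   # rms scales as t^(-1/2)
print(t_sys_kelvin(150.0), noise_600h.std(), noise_2400h.std())
```

Quadrupling the integration time halves the rms, which is the relation between the 600 hour and 2400 hour cases considered later in the paper.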

3.4 Foregrounds and extraction 

As mentioned earlier, a very important ingredient to consider here 
is the accuracy of extracting the foregrounds, especially on large 
scales. This is because the foreground's power increases with scale 
and there is no guarantee that the extraction algorithm, which 
fits the foregrounds along the frequency direction at small spatial 
scales, will not leave any large-scale residuals. 

For the foregrounds we use the simulations of Jelic et al. 
(2008, 2010). These incorporate contributions from Galactic dif- 
fuse synchrotron and free-free emission, and supernova remnants. 
They also include unresolved extragalactic foregrounds from ra- 
dio galaxies and radio clusters. We assume, however, that point 
sources bright enough to be distinguished from the background, ei- 
ther within the field of view or outside it, have been removed well 
below the noise level from the data. Observations of foregrounds at 
150 MHz at low latitude (Bernardi et al. 2009, 2010) indicate that 
these simulations describe the properties of the diffuse foregrounds 
well. 

To test the foregrounds' influence, especially with the amount 
of noise in the data, we apply an extraction algorithm on mock 
data. The mock data include the simulated cosmological signal, the 



instrument response, noise and foregrounds. As mentioned earlier, 
calibration errors and other systematics are not taken into account 
in this simulation. 

In order to extract the foregrounds, we assume that they have 
no small-scale features along the frequency direction. Given what 
we know about the physical origin of the galactic and extragalac- 
tic foregrounds, this assumption is quite reasonable (Petrovic & 
Oh 2010). For the extraction we use the Wp algorithm, which is a
non-parametric method that is very suitable for fitting the spectrally
smooth foregrounds in EoR data sets. The method was developed
for general cases by Mächler (1995), and has been used by Harker
et al. (2009b) as an algorithm for fitting EoR foregrounds. Briefly,
the method is a penalized maximum likelihood algorithm that is
designed to find the maximum likelihood fit for the data but penalizes
relative change of curvature. That is to say, the method finds
the best-fit curve to the data with the smallest possible
ruggedness.
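
To illustrate the general idea of a penalized fit along the frequency direction, the sketch below uses a much simpler stand-in: a penalized least-squares smoother that penalizes squared second differences (often called a Whittaker smoother). This is explicitly not the Wp algorithm, which penalizes relative changes of curvature; the toy power-law foreground, the noise level per channel and the penalty weight `lam` are all arbitrary choices made for this example.

```python
import numpy as np


def penalized_smooth(y, lam):
    """Minimize |y - z|^2 + lam * |D2 z|^2, with D2 the second difference."""
    n = y.size
    d2 = np.diff(np.eye(n), n=2, axis=0)          # (n-2, n) difference operator
    return np.linalg.solve(np.eye(n) + lam * d2.T @ d2, y)


rng = np.random.default_rng(2)
freq = np.linspace(115.0, 200.0, 170)             # MHz, one line of sight
foreground = 2000.0 * (freq / 150.0) ** -2.55     # toy smooth foreground, mK
noise = rng.normal(scale=50.0, size=freq.size)    # toy per-channel noise, mK
data = foreground + noise

fit = penalized_smooth(data, lam=1.0e3)
residual = data - fit                             # what is left for EoR analysis

fit_error_rms = np.sqrt(np.mean((fit - foreground) ** 2))
print(f"rms error of foreground fit: {fit_error_rms:.1f} mK")
print(f"rms of residual:             {residual.std():.1f} mK")
```

In this toy case the fit recovers the smooth component to well below the per-channel noise, so the residual is dominated by the noise. How much large-scale structure such a fit removes or leaves behind depends on the method and on the penalty, which is the question addressed in the text.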

The Wp method has shown very good results for fitting the
foregrounds both in real space (image-plane) and in Fourier space
(uv plane) for up to $100\,h^{-1}$ comoving Mpc and its influence on
the power spectrum statistic has been tested and shown not to be 
significant up to the scales of the simulation (Harker et al. 2010). 
Here however, we would like to test it on much larger scales than 
considered previously. In this paper, application of this method is 
done using uv-plane fitting which has shown slightly better results 
than image-plane fitting (Harker et al. 2010). 

One should note that using predetermined functions, e.g., 
polynomials, to fit the data might introduce systematics due to 
over- or under-fitting of the foregrounds. Hence, the use of more 
advanced non-parametric techniques is essential in this case (see 
e.g., Harker et al. 2009a; Chapman et al. 2012). 



4 RESULTS 

We plot, in Fig. 3, the standard deviation of the smoothed cosmo- 
logical signal (solid lines) and the smoothed noise field (dashed 
lines) as a function of redshift. The smoothing is done with a Gaus- 
sian kernel of 5, 10, 15, 20 and 25 arcmin. It is clear that for 
smoothing scales $\gtrsim 15$ arcmin there is a redshift range in which
the signal becomes larger or comparable to the noise. Obviously, 
if the cosmological signal is coherent on scales larger than the 
smoothing scales, these structures will show up as EoR features 
in the 21 cm maps. 
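
The way smoothing suppresses the noise can be illustrated with a toy calculation. The sketch assumes white noise that is uncorrelated between 3 arcmin pixels, which is a simplification of the actual LOFAR noise, and a periodic image; for a normalized two-dimensional Gaussian kernel with standard deviation $\sigma$ (in pixels, with $\sigma \gtrsim 1$) the rms of white noise is then reduced by a factor of about $1/(2\sqrt{\pi}\,\sigma)$. All function names are illustrative.

```python
import numpy as np


def gaussian_smooth(image, sigma_pix):
    """Convolve a periodic image with a normalized Gaussian via the FFT."""
    ny, nx = image.shape
    fy = np.fft.fftfreq(ny)[:, None]
    fx = np.fft.fftfreq(nx)[None, :]
    kernel_ft = np.exp(-2.0 * np.pi ** 2 * sigma_pix ** 2 * (fx ** 2 + fy ** 2))
    return np.fft.ifft2(np.fft.fft2(image) * kernel_ft).real


def predicted_rms(rms_per_pixel, sigma_pix):
    """Expected rms of smoothed white noise (valid for sigma_pix >~ 1)."""
    return rms_per_pixel / (2.0 * np.sqrt(np.pi) * sigma_pix)


rng = np.random.default_rng(3)
pixel_arcmin = 3.0
noise = rng.normal(scale=56.0, size=(512, 512))   # mK, white per pixel

for sigma_arcmin in (5.0, 10.0, 15.0, 20.0, 25.0):
    sigma_pix = sigma_arcmin / pixel_arcmin
    measured = gaussian_smooth(noise, sigma_pix).std()
    expected = predicted_rms(56.0, sigma_pix)
    print(f"{sigma_arcmin:4.0f} arcmin: {measured:5.2f} mK (expected {expected:5.2f} mK)")
```

For white noise the rms falls inversely with the smoothing scale, whereas a signal that is coherent on scales larger than the kernel is reduced far less, which is why the signal can catch up with the noise at large smoothing scales.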

The fact that the EoR signal catches up with the noise at large 
smoothing scales is driven by the existence of very large ionized
and neutral patches. Regardless of their nature, the sources that
drive the ionization bubbles follow the large-scale structure, which
has a natural scale of $120\,h^{-1}$ comoving Mpc. This sets the largest
scale up to which one can still see coherent structures of neutral and 
ionized regions. 

Since $120\,h^{-1}$ comoving Mpc is the natural scale of the
large-scale structure, one can ask why such smoothing does not
facilitate imaging of the EoR at all redshifts. The answer is
simply that without the ionized regions the contrast within the map is
driven by the underlying density. This of course assumes that one
can ignore the fluctuations due to $T_{\rm spin}$, which is a good assumption
in the redshift range of 6-11.5 that LOFAR will probe, but not a
good assumption at much higher redshifts. That is to say, if the
brightness temperature fluctuations were driven by the cosmologi- 
cal density alone, then it would be more difficult, with LOFAR, to 
image the signal even on large scales. 

Figure 3. The left panel shows the standard deviation of the signal (solid lines) and the noise (dashed lines) as a function of redshift for the
$400\,h^{-1}$ comoving Mpc simulation. Here we assume 600 hours of observation. The black, red, blue, dark green, brown, cyan and pink
solid lines are for instrument resolution ($\approx 3$ arcmin), 5, 10, 15, 20, 25 and 30 arcmin resolutions, respectively. The right panel shows the signal-to-noise
ratio for the same cases. Notice that with 15 arcmin resolution and above there is a redshift range in which the signal rms exceeds the noise rms. This high
signal-to-noise region is typically centered around the redshift at which the IGM is 50% ionized. It is also worth noting that although the rms of the signal
decreases with the smoothing scale, the decrease in the noise is even larger.



Fig. 4 shows original and reconstructed maps of the simulated
EoR signal in a $2.5^\circ \times 2.5^\circ$ field of view at redshift 9. We note here
that this is about a quarter of the LOFAR field of view assuming a
single beam. We compare here the 20 arcmin Gaussian smoothed
map shown in Panel B with the following cases: Panel C, a
noisy signal assuming 600 hours of integration with LOFAR but
without including foreground effects; Panel D, the same map as in
C, i.e., with noise added assuming 600 hours of integration, but
with inclusion of the foregrounds and their extraction with the
Wp fitting procedure (Harker et al. 2009b); Panel E, a noisy signal
assuming 2400 hours of integration of the same field (half the noise
level), still without the inclusion of the foreground effects; and
Panel F, the same as E but with the foregrounds included and then
extracted. The smoothing is done with a 20 arcmin Gaussian kernel.
After smoothing, the signal rms in map A is 2.6 mK. The noise levels
after 600 hours of integration in maps C and D are 2 mK and 2.2 mK
(0.9 mK of which is due to foreground residuals), respectively. The
noise levels after 2400 hours of integration and 20 arcmin smoothing
in maps E and F are 1.1 mK and 1.3 mK (0.7 mK of which is due to
foreground residuals), respectively. It is clear that after 600 hours of
observation one has the ability, albeit a limited one, to map the EoR
signal, although the contour map is dominated by the noise. However,
Panel F shows that after 2400 hours of observation the noise influence
drops significantly at this smoothing scale, and a more reliable map can
be seen. This remains true even after inclusion of the foreground
effect, that is, when traces of the foreground extraction are present
on large scales.

A visual inspection shows a clear similarity between Panel B
and Panels C-F in Fig. 4. To quantify this similarity we use two
different methods. The first method is the Pearson cross-correlation
coefficient, $\rho$, calculated with the formula,

$$\rho = \frac{\sum_i (x_i - \langle x \rangle)(y_i - \langle y \rangle)}{\sqrt{\sum_i (x_i - \langle x \rangle)^2}\,\sqrt{\sum_i (y_i - \langle y \rangle)^2}},$$

where $x_i$ and $y_i$ are the values of pixel $i$ in the two maps. The
results of this calculation are shown in Table 1. The table shows the
correlation coefficient between the maps C-F and the noise- and
foreground-free map B from Fig. 4. The existence of foregrounds
and their extraction in maps D and F clearly reduces the correlation.
Also, the higher noise in maps C and D results in smaller correlation
coefficients. Still, the correlation coefficients shown in the table are
very high in all cases.

Table 1. Pearson correlation coefficients between Map B and the other
maps.

Map      C      D      E      F
$\rho$   0.77   0.68   0.93   0.80
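
A direct implementation of the Pearson coefficient for two maps is sketched below, together with the value one would expect if the noisy map were simply the clean map plus independent noise, $\rho \approx \sigma_s/\sqrt{\sigma_s^2 + \sigma_n^2}$. The function names and the random toy maps are introduced here for illustration; the toy maps are not the simulated EoR maps of Fig. 4.

```python
import numpy as np


def pearson(map_x, map_y):
    """Pearson cross-correlation coefficient between two equally sized maps."""
    x = np.ravel(map_x) - np.mean(map_x)
    y = np.ravel(map_y) - np.mean(map_y)
    return np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))


def expected_rho(signal_rms, noise_rms):
    """Expected coefficient for a signal plus independent noise."""
    return signal_rms / np.hypot(signal_rms, noise_rms)


rng = np.random.default_rng(4)
clean = rng.normal(scale=2.6, size=(256, 256))             # toy 'map B'
noisy = clean + rng.normal(scale=1.1, size=clean.shape)    # toy noisy map

print(f"measured rho: {pearson(clean, noisy):.3f}")
print(f"expected rho: {expected_rho(2.6, 1.1):.3f}")
```

With a signal rms of 2.6 mK and a noise rms of 1.1 mK the expected value is about 0.92. Correlated noise and foreground residuals, which are not modelled in this toy example, would change the measured value.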

Figure 4. EoR maps at redshift 9 ($\langle x_{\rm HI} \rangle = 0.2$), with a field of view of
2.5° × 2.5° (note that the LOFAR field of view will be around 5° × 5°).
Panel A shows the original simulated EoR map at full resolution. The map
shown in Panel B is smoothed with a 20 arcmin (standard deviation)
Gaussian kernel. Panel C shows the same map as in B but with noise added to
it assuming 600 hours of observation with LOFAR. Panel D is the same
as C, i.e., with 600 hours of observation noise, but here the foregrounds
were added and then extracted with the Wp fitting procedure (Harker et al.
2009b). Panel E is the same as C but with half the noise level of the map
in panel C (2400 hours of observation). Panel F is the same but with 2400
hours of observation noise and foregrounds that were added and then
extracted with the Wp fitting procedure. The contour levels are colour coded
as shown in the colour table at the top of the figure.

The other method we use to quantify the correlation between
the maps is to inspect their phase information. This is done by
Fourier transforming each map and then checking whether the
Fourier space phases of maps C-F correspond to those of the
original map B. If this were done for all the points in Fourier space,
one would obtain no correlation between the phases. This is
because the amplitudes of most points in the Fourier transforms of
maps C-F are dominated by numerical noise and contain no useful
information. In Fig. 5 we show a log-log plot of the rank-ordered
Fourier coefficient amplitudes of the five images. The solid black
line is the one for the image shown in Panel B of Fig. 4, whereas
the others are for the rest of the 20 arcmin smoothed images. Each
line is normalized such that its maximum amplitude is one. All the
lines show the same typical behavior: the amplitude of the
first few hundred pixels is high, but it then drops exponentially to
slowly varying (almost flat) values, which is typical white noise
behavior. Note that the number of significant coefficients is larger in
maps C-F than in map B because the former contain a contribution
from the (correlated) system noise. The flatness of this part
of the plots indicates that they are dominated by white noise. We
would like to emphasize that this is not the system noise contribution,
which typically has a much larger amplitude and is not white
(see Fig. 2). Therefore, in order to compare the phases of the
various images, we only take into account the pixels that have values
larger than $10^{-4}$ times the maximum amplitude.
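The rank ordering and the amplitude cut described above can be illustrated on a toy map. This is a sketch under stated assumptions, not the pipeline used for Fig. 5. It uses a naive discrete Fourier transform on a small square map so that it needs only the standard library, whereas a library FFT would be the natural choice for real maps. All function names are hypothetical.

```python
import cmath
import math


def dft2(image):
    """Naive 2-D discrete Fourier transform of a small square map (list of rows)."""
    n = len(image)
    coeffs = []
    for u in range(n):
        row = []
        for v in range(n):
            total = 0j
            for j in range(n):
                for k in range(n):
                    angle = -2.0 * math.pi * (u * j + v * k) / n
                    total += image[j][k] * cmath.exp(1j * angle)
            row.append(total)
        coeffs.append(row)
    return coeffs


def rank_ordered_amplitudes(coeffs):
    """Amplitudes sorted from largest to smallest, normalized to a peak of one."""
    amps = sorted((abs(c) for row in coeffs for c in row), reverse=True)
    return [a / amps[0] for a in amps]


def significant_mask(coeffs, frac=1e-4):
    """True where the amplitude exceeds frac times the maximum amplitude."""
    peak = max(abs(c) for row in coeffs for c in row)
    return [[abs(c) > frac * peak for c in row] for row in coeffs]


# Toy 8x8 map containing a single cosine mode along the first axis.
n = 8
toy_map = [[math.cos(2.0 * math.pi * j / n) for _ in range(n)] for j in range(n)]
coeffs = dft2(toy_map)
curve = rank_ordered_amplitudes(coeffs)
mask = significant_mask(coeffs)
n_significant = sum(flag for row in mask for flag in row)
print("significant Fourier pixels:", n_significant)
```

In this toy case only the two conjugate coefficients of the cosine mode survive the cut, and every other coefficient sits at the level of floating-point round-off, which is the analogue of the flat numerical-noise floor discussed in the text.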

Figure 5. The rank-ordered Fourier space amplitude for each of the 5
images, B-F, shown in Fig. 4. All curves are normalized with respect to their
maximum amplitudes. The solid black line is the plot for map B whereas the
other curves show the Fourier space amplitudes of image C (blue dotted
line), D (cyan dashed line), E (red dotted-dashed line) and F (magenta
double dotted-dashed line). All uv maps are dominated by the highest few
hundred pixels. The rest of the Fourier space pixels are noise dominated
as demonstrated by the sudden drop in the amplitudes and their almost flat
slope thereafter.

Next, we plot the phases of the pixels with relatively high
amplitudes ($> 10^{-4}$ of the maximum amplitude). Each of the four
panels of Fig. 6 plots the phases of the reconstructed images (maps
C-F in Fig. 4) versus the phases of the original map (map B in
Fig. 4). These plots are presented as density plots. A high
correlation shows as a high concentration of points (high contour values)
along the diagonal. In all the panels the correlation between the
phases is obvious. The best correlation is clearly obtained in the
lower left panel because map E has the lowest noise and is without
foregrounds. The worst correlation, though still a very clear one,
is obtained in the upper right panel because map D has
high noise and still has some residuals from the subtraction of the
foregrounds.
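The phase comparison described above can be sketched as follows. The code takes two lists of Fourier coefficients, one for the original map and one for a reconstructed map, keeps the modes that pass the amplitude cut, and bins the phase pairs into a coarse 2-D histogram that stands in for the density plot. Applying the cut to the amplitudes of the original map is an assumption made here for simplicity, and the function names and toy coefficients are hypothetical.

```python
import cmath
import math


def phase_pairs(ref_coeffs, rec_coeffs, frac=1e-4):
    """(original phase, reconstructed phase) for modes above frac * max amplitude."""
    peak = max(abs(c) for c in ref_coeffs)
    return [
        (cmath.phase(a), cmath.phase(b))
        for a, b in zip(ref_coeffs, rec_coeffs)
        if abs(a) > frac * peak
    ]


def phase_density(pairs, bins=8):
    """2-D histogram of phase pairs; both axes run from -pi to pi."""
    width = 2.0 * math.pi / bins
    grid = [[0] * bins for _ in range(bins)]
    for p_ref, p_rec in pairs:
        i = min(int((p_ref + math.pi) / width), bins - 1)
        j = min(int((p_rec + math.pi) / width), bins - 1)
        grid[i][j] += 1
    return grid


# Toy coefficients: eight strong modes whose phases are slightly perturbed in
# the reconstruction, plus four negligible modes that the cut should discard.
bins = 8
width = 2.0 * math.pi / bins
centres = [-math.pi + (k + 0.5) * width for k in range(bins)]
ref = [cmath.rect(1.0, t) for t in centres] + [1e-6 + 0j] * 4
rec = [cmath.rect(0.9, t + 0.05) for t in centres] + [1e-6j] * 4

pairs = phase_pairs(ref, rec)
grid = phase_density(pairs, bins)
on_diagonal = sum(grid[k][k] for k in range(bins))
print(f"{on_diagonal} of {len(pairs)} phase pairs fall on the diagonal")
```

Because phases are periodic, a pair that straddles the ±π boundary lands in an off-diagonal corner of such a histogram even when the two phases nearly agree.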

We repeated the same procedure on the
$200\,h^{-1}$ comoving Mpc BEARS simulations and obtained very
similar results. To cover the same angular size as the previous
simulation we tile the BEARS simulation box to reach
$400\,h^{-1}$ comoving Mpc. The result of this simulation is shown
in Fig. 7, where the left panel shows the original 20 arcmin
smoothed simulation assuming 2400 hours of observation with
LOFAR. The right panel shows the extracted image after adding
noise and foregrounds, also smoothed with a 20 arcmin Gaussian.
The two maps are clearly very similar, with a correlation coefficient
of $\approx 0.61$. The correlation coefficient here is lower than in the same
comparison done with the previous simulation (between panels B
and F in Fig. 4) due to the relatively small size of the simulation box,
in which the number of large-scale modes is smaller.

This conclusion is insensitive to the type of source we assume
to power reionization, i.e., thermal or power-law. This is reassuring
and indicates that this effect is driven more by the very large-scale
structure than by the details of the reionization process. It should be
emphasized here that with higher resolution the increased number
of low-mass sources might slightly change the picture, but not in
a drastic way, as already seen in very large-scale high-resolution
simulations (Iliev & Mellema, private communication).

Figure 8. The very central area of the LOFAR core, the "superterp". The left panel shows a picture of the superterp that was taken in the spring of 2011 (credit
for the picture goes to "TopFoto, Assen"). The right side shows a sketch of the layout of the stations in the superterp. The large blue filled circles indicate the
locations of the 6 existing LOFAR Low Band Antenna (LBA) stations in the superterp. The red filled circles show the locations of the existing 12 High Band
Antenna (HBA) stations, whereas the pink filled circles show a possible configuration for an additional 12 HBA stations which would double the collecting
area. The sizes of the stations are to scale. The addition of 12 HBA stations would significantly increase the sensitivity of LOFAR on large scales.

Figure 7. EoR maps at redshift 7 ($\langle x_{\rm HI} \rangle = 0.5$) produced by the BEARS algorithm, with a
field of view of 2.5° × 2.5°. The left panel shows the original
20 arcmin smoothed simulation assuming 2400 hours of observation with
LOFAR. The right panel shows the extracted image after adding noise and
foregrounds, also smoothed with a 20 arcmin Gaussian. The two maps are
clearly very similar, with a correlation coefficient of $\rho \approx 0.61$.



Figure 6. The phases of the reconstructed images (maps C-F in Figure 4) 
versus the phases of the original map (map B in Figure 4). The plots are 
shown as density plots where the density represents the number of points
per unit area; hence, highly correlated maps should show as high density
contours at the diagonal. The map shows the highest 67% of the density 
PDF. The high correlations at the upper left side and lower right side of the 
figures simply reflect the periodicity of the phases. 



5 ENHANCING THE LARGE-SCALE IMAGING 
CAPABILITIES OF LOFAR 

Clearly, the best imaging quality is obtained when one assumes a
very large amount of observing time (2400 hours) focusing on one
single field (Fig. 4). Obviously, this is a vast amount of telescope
time that is hard to accumulate, especially on open-time telescopes,
such as LOFAR will become within a number of years. Hence, in
what follows we show that a relatively inexpensive modification of
the LOFAR telescope can significantly enhance its sensitivity on
large scales.

Figure 9. The uv-plane covered by the HBA stations after 6 hours of observation at a declination of +48°. The left panel shows the uv-plane obtained
with the current superterp HBA stations whereas the right panel shows the uv-plane with the proposed additional 12 HBA stations.

Increasing the signal-to-noise of the data on scales
$\gtrsim 30$ arcmin, corresponding to comoving scales of
$\approx 60\,h^{-1}$ comoving Mpc at high redshifts, will allow imaging
the EoR, especially the last phases of reionization ($\langle x_{\rm HI} \rangle \lesssim 0.5$),
within a very reasonable amount of observational time. The quality
of imaging on such angular scales for a 2 meter wavelength
depends on the number of baselines with lengths of $\lesssim 200$ meters,
and hence on the number of stations within the very central area,
i.e., around and within the so-called superterp of LOFAR. For
clarity, a High Band Antenna (HBA) station here means a collection of
24 HBA tiles, with each tile having 4x4 antennas (dipoles). The
superterp, which in Dutch means "super mound", is shown in the left
panel of Fig. 8 and is a circular area with a radius of about 150 m
(baselines up to 300 m). In this area there are currently about 12
HBA stations of LOFAR that operate in the frequency range of
115-230 MHz. Below we consider the effect of increasing the number
of HBA stations in the superterp as a possible way to cut the
observation time needed to reach the desired sensitivity for imaging. The right
panel of Fig. 8 shows a possible configuration of 24 stations
in the superterp: the current 12 stations (red filled circles)
and 12 new stations (pink filled circles). This would increase
the number of baselines within the superterp and the area
immediately surrounding it by a factor of four. The signal-to-noise for the
same observation time would therefore be enhanced by a factor
of at least two, i.e., the integration time needed to reach a given
sensitivity would be reduced by a factor of four. An addition of HBA
stations would not be too costly in relative terms, since the additional
infrastructure needed for rolling out the added stations would be
relatively small. Fig. 9 shows the improvement
in the uv coverage for the case of 24 HBA stations in
the superterp (right panel) relative to that of the current 12 HBA
stations (left panel). Such an enhancement, which is relatively easy
and cheap to obtain, would increase the sensitivity of LOFAR on
these scales by a factor of two, thus boosting the LOFAR imaging
capability at large scales. It would enable the
LOFAR-EoR project to image the reionization process on large
scales rather than only detecting it statistically. Another issue that
we are studying is the possibility of correlating all the tiles within
the superterp, instead of whole stations, which would add many
baselines to the measurement. This extension proposal has to be studied
in more detail before its ramifications can be estimated accurately.
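The numbers quoted in this section follow from simple counting, sketched below. The sketch assumes that the angular scale probed by a baseline of length b at wavelength λ is roughly λ/b, that N stations give N(N-1)/2 baselines, and that for a fixed integration time the signal-to-noise grows as the square root of the number of baselines sampling the relevant scales. The last scaling is a simplifying assumption, and the function names are illustrative.

```python
import math


def n_baselines(n_stations):
    """Number of distinct station pairs."""
    return n_stations * (n_stations - 1) // 2


def angular_scale_arcmin(wavelength_m, baseline_m):
    """Approximate angular scale, lambda / b, converted to arcminutes."""
    return math.degrees(wavelength_m / baseline_m) * 60.0


current = n_baselines(12)    # existing superterp HBA stations
proposed = n_baselines(24)   # after adding 12 more stations
baseline_gain = proposed / current

# Assumed scaling: S/N ~ sqrt(number of baselines) at fixed integration time,
# so the time needed to reach a given S/N drops by the square of the S/N gain.
snr_gain = math.sqrt(baseline_gain)
time_reduction = snr_gain ** 2

scale_200m = angular_scale_arcmin(2.0, 200.0)

print(f"baselines: {current} -> {proposed} (x{baseline_gain:.2f})")
print(f"S/N gain: x{snr_gain:.2f}, integration time reduced by x{time_reduction:.2f}")
print(f"a 200 m baseline at 2 m wavelength probes ~{scale_200m:.0f} arcmin")
```

Doubling the station count raises the baseline count from 66 to 276, slightly more than the factor of four quoted in the text, and a 200 m baseline at a wavelength of 2 m corresponds to roughly half a degree.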



6 CONCLUSIONS & DISCUSSION 

A new generation of low frequency interferometers has recently
come on line (LOFAR, MWA, PAPER, SKA). Exploring the EoR
is a major science driver for these telescopes. The current common
wisdom in the field is that these telescopes will detect the EoR
statistically but that they will not be able to image it, due to their poor
sensitivity. The low sensitivity affects both small scales and large
scales. The influence of the noise on small scales is quite clear; its
influence on large scales is indirect and has to do with the
ability to fit the foregrounds well along the frequency direction. Since
this fitting is done for each image or uv plane pixel along the
frequency direction, the accuracy of the fit will depend sensitively on
the noise level in the data. Hence, the low signal-to-noise will
decrease the quality of the foreground fit, especially on large scales
where the foreground power becomes larger. With the projected
level of noise per resolution element in the current generation of
experiments it is no wonder that imaging has been deemed possible
only with future instruments such as the Square Kilometer Array
(see e.g., Zaroubi 2010).

In this paper we have shown that imaging of the neutral IGM at 
the later stages of reionization with the current generation of radio 
interferometers is possible on very large scales ($\gtrsim 0.5°$). The
existence of very large-scale neutral regions towards the later phases
of the reionization process and their large contrast with the ionized
regions enhances the EoR signal in two ways: firstly, the large
neutral regions increase the amplitude beyond that expected from mere
cosmological density fluctuations; secondly, their large size gives
a coherent signature over about a hundred resolution elements for 
LOFAR, which overcomes the poor signal-to-noise ratio in each of 
them. These two effects make it possible in principle to image the 
reionization process on large scales. 

A simple argument in support of our conclusion could also be
cast as follows. The ionizing sources are preferentially located in
the high-density regions, which typically ionize before the low-density
regions. Since the density fluctuation power spectrum peaks at
$120\,h^{-1}$ comoving Mpc, this will be roughly the scale of the
ionized and neutral regions at the midpoint of reionization. It is this
scale that essentially allows us to image the EoR on large scales
with instruments like LOFAR. Notice that this argument holds
almost regardless of the type of reionization sources.

The second issue we address here is the influence
of the foregrounds on the extracted EoR signal, especially on large
scales. We show that for realistic foreground and noise models
current state-of-the-art extraction techniques do very well even on
scales in excess of a degree, which have so far been thought to
be inaccessible to the current generation of experiments. This again
demonstrates that imaging of the EoR on large scales is possible
with LOFAR. We have also shown that a modest enhancement of
the capabilities of LOFAR at the central area of the core, known
as the superterp, would greatly boost the possibility of imaging the
EoR on large scales.

The importance of imaging the EoR with current telescopes 
cannot be overstated. Astrophysically, imaging would allow ad- 
dressing a large number of issues that would otherwise be difficult 
to deal with. For example, it would make it possible to identify 
spatially where the ionized regions are and hence it would allow 
targeting these regions with follow-up optical and infrared surveys,
and iterating on the foreground subtraction (Petrovic & Oh 2010).
More importantly, given the issues that face the current experiments,
imaging will be of utmost importance in discovering and addressing
systematic effects whose existence would otherwise be impossible to
realize. In other words, imaging of the EoR would boost our
confidence in the reliability of the measured signal and the properties
attributed to it. Conversely, if the measured power spectrum
indicates the existence of EoR power on very large scales, then the
availability of images on such scales would allow us to determine 
whether this is a result of the cosmological signal or of systematic 
effects that have not yet been brought under control. 

In the future, the Square Kilometer Array (SKA) will provide
enough signal-to-noise to image the EoR with very high accuracy
on scales up to 5° and down to scales of the order of 1 arcmin. This
will clearly surpass LOFAR's performance. SKA will also be
able to go to much lower frequencies (down to 50 MHz), enabling a
direct observation of the Universe's Dark Ages. However, SKA will
still take about a decade to become operational, whereas LOFAR
and the other current instruments are already here and can make
significant scientific discoveries that can be enhanced even further
with relatively little extra cost.